Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee
There is growing interest in combining model-free and model-based approaches in reinforcement learning with the goal of achieving the high performance of model-free algorithms with low sample complexity. This is difficult because an imperfect dynamics model can degrade the performance of the learning algorithm, and in sufficiently complex environments, the dynamics model will always be imperfect. As a result, a key challenge is to combine model-based approaches with model-free learning in such a way that errors in the model do not degrade performance. We propose stochastic ensemble value expansion (STEVE), a novel model-based technique that addresses this issue. By dynamically interpolating between model rollouts of various horizon lengths, STEVE ensures that the model is only utilized when doing so does not introduce significant errors. Our approach outperforms model-free baselines on challenging continuous control benchmarks with an order-of-magnitude increase in sample efficiency.
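To make the interpolation concrete, here is a minimal Python/NumPy sketch of the reweighting step, assuming the per-horizon candidate targets and their ensemble variances have already been computed (function and variable names are illustrative, not from the authors' code): each candidate target is weighted by its normalized inverse ensemble variance, so horizons on which the model and Q-function ensembles disagree contribute little to the final target.

import numpy as np

def steve_target(candidate_targets, candidate_variances, eps=1e-8):
    """Interpolate between rollout targets of different horizon lengths.

    candidate_targets:   shape (H+1,), one TD target per rollout horizon 0..H
    candidate_variances: shape (H+1,), ensemble variance of each target

    High-variance horizons (where the learned model is unreliable) are
    down-weighted, so model errors do not dominate the final target.
    """
    weights = 1.0 / (candidate_variances + eps)  # inverse-variance weights
    weights = weights / weights.sum()            # normalize to sum to 1
    return float(np.dot(weights, candidate_targets))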
Reviews: Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
The main algorithmic idea is a weighted combination of H-step temporal-difference estimates, computed by rolling out a learned model of the environment. The underlying idea is to let the learner trade off between estimation errors in the model and in the Q-function across different parts of the state-action space during learning. The resulting TD estimator is incorporated into the DDPG algorithm in a straightforward manner. The update is computationally more intensive, but the result is improved sample complexity. The experimental results on a variety of continuous control tasks show significant improvement over the baseline DDPG and over a related method, MVE, which is the precursor to this work. The paper is well written and the empirical results are very promising. The analysis and discussion are a bit limited, but this is not a major drawback. Overall, there is much to like about the paper.
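As a concrete illustration of the H-step targets the reviewer describes, the sketch below forms one candidate target per horizon from a single real transition (s, a, r, s_next) by unrolling a learned model. It is a simplified sketch under stated assumptions (deterministic model, episode termination ignored, all names hypothetical), not the paper's implementation.

def h_step_targets(r, s_next, model, reward_fn, policy, q_fn, H, gamma=0.99):
    """Candidate TD targets for horizons h = 0..H from one real transition.

    h = 0 is the ordinary model-free target r + gamma * Q(s', pi(s'));
    larger h unrolls the learned model from s_next, accumulating its
    predicted rewards before bootstrapping with the critic.
    (Simplified: episode termination is ignored.)
    """
    targets = []
    ret, disc, s = r, gamma, s_next
    for _ in range(H + 1):
        a = policy(s)
        targets.append(ret + disc * q_fn(s, a))  # bootstrap at horizon h
        ret += disc * reward_fn(s, a)            # model-predicted reward
        disc *= gamma
        s = model(s, a)                          # imagined next state
    return targets

Rather than committing to a single horizon, STEVE then interpolates between these candidate targets, e.g. with the inverse-variance weighting sketched after the abstract above.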